Abstract

This paper examines several line-drawing pattern recognition methods for handwritten
character recognition: the picture descriptive language (PDL), Berthod and Maroy's
method (BM), the extended Freeman chain code (EFC), the error transformation (ET) method,
tree grammars (TG), and array grammars (AG). A new character recognition scheme
that uses improved extended octal codes as primitives is then introduced. This scheme
offers the advantages of handling flexible size, orientation, and variations, needing fewer
learning samples, and having a lower degree of ambiguity. Finally, the simulation of off-line
character recognition by its real-time on-line counterpart is investigated.

Keywords  :  pattern  recognition,  character  recognition,  syntactic  approach,  structural 
approach,  grammars,  on-line  vs  off-line,  learning,  degrees  of  recognizability,  learnability 
and  ambiguity,  artificial  intelligence 

1.  Introduction 

During the past quarter century there has been growing interest among computer
scientists, engineers and researchers in problems related to the replication of human
functions and the machine simulation of human reading. Intensive research has been done
in this area, and a large number of technical papers and reports on line-drawing pattern
recognition (including character recognition) can be found in the literature
[1,2,3,8,10,12-16,23-25,27-32]. The subject has attracted such immense interest and effort
not only because it is interesting and challenging, but also because it provides a strategy
for automated postal code reading, check verification, office automation and a large variety
of banking, business, industrial, engineering and scientific applications [8,25,27].

Recently, the state of the art in line-drawing pattern recognition has advanced from
the use of primitive schemes for the recognition of machine-printed alphanumerics to the
application of sophisticated techniques for the recognition of a wide variety of
handwritten characters and symbols [1,27,32,34].

It is interesting to see that even human beings, who possess the best trained optical
reading devices (eyes), make mistakes about 5 percent of the time when reading in the
absence of context [24]. These errors are mainly caused by the endless variations of shape
resulting from the writing habits, style, education, region of origin, social environment,
mood, health and other conditions of the writer. For character recognition by computers,
the error rate could be even higher, for the reasons described above as well as other
factors such as the writing instrument, writing surface, scanning mechanisms, learning
techniques and machine recognition algorithms [3,8].

This paper begins with an analysis and comparison of several line-drawing pattern
recognition methods. These approaches can be classified into two basic categories,
namely (1) the syntactic approach (such as PDL, tree grammars and array grammars) and (2)
the structural approach (such as the BM method and the extended Freeman chain code method).
Their advantages and disadvantages are discussed. We concentrate on syntactic and
structural methods rather than conventional statistical methods because they are
hierarchical in nature, i.e. they divide a large, complicated pattern into smaller
subpatterns according to its structure, which in turn can be further divided into smaller
subpatterns and so on, until primitives are found which can no longer be divided. Using
such structural and syntactic approaches, one can directly take advantage of powerful
data structures such as grammatical rules, trees and directed labeled graphs, which are
widely used in dealing with linguistic problems [6,7,11,18,20,21]. Such approaches are
similar to the powerful artificial intelligence problem-solving strategies in which a large
or complicated problem is reduced to smaller subproblems [17,19,26,35,36]. Further, it
has been shown that syntactic and structural approaches can overcome some disadvantages
of the classic decision-theoretic (statistical) approach, which has difficulty in
distinguishing between two very similar patterns (characters) [6,18]. Among the various
structural and syntactic approaches, we believe that those with a higher degree of
recognizability are normally more natural, in that they are more accurate and closer to the
intuition of human beings, who possess the best pattern recognition skill for dealing with
handwritten symbols.


A new character recognition scheme using improved extended octal codes as primitives
is discussed. This scheme provides certain advantages, such as handling flexible size,
orientation and variations, needing fewer learning samples, and having a lower degree of
ambiguity and a higher degree of recognizability. Finally, an off-line character
recognition technique that simulates on-line real-time techniques is studied.


2.   Notations,  Terminologies,  Definitions,  and  Fundamentals 

Frequently  we  hear  people  say,  "  Your  handwriting  is  terrible.  It  is  hard  to  recognize." 
But  what  constitutes  a  well  recognizable  or  a  hardly  recognizable  handwriting  at  the 
level  of  character,  word  or  even  a  sentence?  Looking  at  Table  2.1,  one  can  see  that 
category  1  is  recognizable,  category  2  is  recognizable  with  some  effort,  category  3  is 
recognizable  with  great  difficulty  (and  perhaps  with  a  high  risk  of  mis-recognition)  and 
category  4  is  beyond  recognition.  While  this  is  intuitively  true,  it  is  necessary  to  define  a 
more  concrete  criterion  (or  criteria)  of  "recognizable"  characters  for  machine  simulation 
of  human  reading.  Figure  2.1  shows  a  block  diagram  of  a  typical  character  recognition 
scheme. 

Definition  2.1  The  degree  of  ambiguity  of  a  character  B  can  be  defined  in  terms  of  the 
problem  domain  D  and  the  encoding  scheme  E  as  follows  : 

DegAmb(B,D,E)=max.  no.  of  interpretations  of  B  in  D  under  E. 

Definition  2.2  The  degree  of  recognizability  of  a  character  B,  in  the  domain  of  D, 
under  the  encoding  scheme  E  can  be  defined  as  follows: 


$$\mathrm{DegRec}(B,D,E) = \begin{cases} 1/\mathrm{DegAmb}(B,D,E) & \text{if } B \text{ is in the dictionary} \\ 0 & \text{otherwise} \end{cases}$$
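
Definitions 2.1 and 2.2 can be illustrated with a small Python sketch. It assumes that an encoding scheme E is represented as a dictionary from code words to the set of characters sharing that code, and that DegAmb of a character is the largest such set over all code words the character can take. The function names and the toy dictionary are hypothetical and serve only as an illustration.

```python
# Toy dictionary (hypothetical): code word -> set of characters with that code.
TOY_DICTIONARY = {
    "w1": {"I", "Y"},
    "w2": {"C", "O"},
    "w3": {"Q"},
    "w4": {"O"},
}

def deg_amb(char, dictionary):
    """Max. number of interpretations over all code words that can denote char."""
    sizes = [len(symbols) for symbols in dictionary.values() if char in symbols]
    return max(sizes) if sizes else 0

def deg_rec(char, dictionary):
    """1/DegAmb if the character is in the dictionary, 0 otherwise."""
    amb = deg_amb(char, dictionary)
    return 1.0 / amb if amb else 0.0
```

In this toy dictionary, 'O' has an unambiguous code w4 but also shares w2 with 'C', so its degree of ambiguity is 2 and its degree of recognizability is 1/2.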


Conventionally one would conjecture that the easier a character is to learn, the easier it
is to recognize, and vice versa. However, this is not always true. We do find examples
which are easier to learn but harder to recognize, and vice versa. Take 'O' and 'Q' for
instance. While 'O' is comparatively easier to learn (syntactically) than 'Q', it is harder
to recognize (semantically). Notice that Definitions 2.1-2.2 are valid not only for
well-behaved printed letters, but also for handwriting variations. For instance, in Figure 2.2,
DegAmb(X,D,E) > 2 and DegAmb(O,D,E) > 2.

For more examples, please see Figure 2.3. In general, there are 8 possible relations
between learning and recognition as shown in Figure 2.4.

[Table 2.1 Several levels of recognizability: handwritten samples, from "The party
begins." through "2 drinks later."]

[Figure 2.1 Block diagram of a typical recognition scheme. Learning: Input Data ->
Pre-processing (Feature Extraction, Noise Elimination, Data Compression, Segmentation) ->
Dictionary Construction -> Decision-Tree Construction. Recognition: Input ->
Pre-processing -> Dictionary Matching -> Decision-Tree Matching.]

[Figure 2.2 Continuous transformation between I and Y, C and O.]

[Figure 2.3 Some more examples of continuous transformation.]

[Figure 2.4 Relation between 'learning' and 'recognition': easier to learn <-> easier to
recognize; harder to learn <-> harder to recognize.]

3.   Syntactic  Methods:  PDL,  Tree  Grammar  and  Array  Grammar 


This section investigates several syntactic methods for line-drawing pattern recognition,
namely the picture descriptive language, tree grammars and array grammars, with an
analysis and comparison of them.

3.1. The picture descriptive language PDL

The picture descriptive language (PDL) was developed by Shaw [7,20,21]. Each primitive
is labeled at two distinguished points, a tail and a head. A primitive can be linked
or concatenated to other primitives only at its tail and/or head.


Primitive structure    Interpretation

a + b                  head(a) CAT tail(b)
a × b                  tail(a) CAT tail(b)
a - b                  head(a) CAT head(b)
a * b                  (tail(a) CAT tail(b)) and (head(a) CAT head(b))

The grammar that generates sentences in PDL is a context-free grammar

G = (Vn, Vt, P, S)
where Vn = {S, SL}, Vt = {b, +, ×, -, *, ~, /, (, ), l},

b may be any primitive (including the "null point primitive" λ, which has identical tail
and head) and P:

S → b,    S → S T S,    S → ~S,    S → SL,
S → /SL,  SL → S,       SL → SL T SL,  SL → ~SL,
SL → /SL, T → +,        T → ×,     T → -,
T → *.

The primitives can be as follows: horizontal strokes h1, h2; diagonal strokes g1, g2, g3
(direction \); diagonal strokes d1, d2, d3 (direction /); and vertical strokes v1, v2, v3.

Then, we can express the English characters as follows [6].

A → (d2+((d2+g2)*h2)+g2)
B → ((v2+((v2+h2+g1+(~(d1+v1)))*h2)+g1+(~(d1+v1)))*h2)
C → (((~g1)+v2+d1+h1+g1+(~v1))×(h1+((d1+v1)×λ)))
D → (h2*(v3+h2+g1+(~(d1+v2))))
E → ((v2+((v2+h2)×h1))×h2)
F → ((v2+((v2+h2)×h1))×λ)
G → (((~g1)+v2+d1+h1+g1+(~v1))×(h1+((d1+v1+v1-h1)×λ)))
H → (v2+(v2×(h2+(v2×(~v2)))))
I → (v3×λ)
J → ((((~g1)+v1)×h1)+((d1+v3)×λ))
K → (v2+(v2×d2×g2))
L → (v3×h2)
M → (v3+g3+d3+(~v3))
N → (v3+g3+(v3×λ))
O → (h1*((~g1)+v2+d1+h1+g1+(~(d1+v2))))
P → ((v2+((v2+h2+g1+(~(d1+v1)))*h2))×λ)
Q → (h1*((~g1)+v2+d1+h1+g1+(~(d1+((~g1)×g1)+v2))))
R → (v2+(h2*(v2+h2+g1+(~(d1+v1))))+g2)
S → ((((~g1)+v1)×h1)+((d1+v1+(~(g1+h1+g1))+v1+h1+g1+(~v1))×λ))
T → ((v3+(h1×(~h1)))×λ)
U → ((((~g1)+v3)×h1)+((d1+v3)×λ))
V → ((~g3)×d3×λ)
W → (((~g3)+d3+g3)+(d3×λ))
X → (d2+((~g2)×d2×g2))
Y → ((v2+((~g2)×d2))×λ)
Z → ((d3-h2)×h2)
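
The concatenation operators can be made concrete with a minimal Python sketch. It assumes that a picture is a set of line segments with a tail and a head, that every binary operator yields a picture whose tail is tail(a) and whose head is head(b), and that the primitives v3 and h2 are an upward stroke of length 3 and a rightward stroke of length 2. The class and function names, and these coordinates, are hypothetical. The `*` operator is omitted because it is only defined when both attachment conditions can hold at once.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Picture:
    segments: tuple  # line segments as ((x0, y0), (x1, y1))
    tail: tuple
    head: tuple

def primitive(dx, dy):
    """A single stroke from the origin (tail) to (dx, dy) (head)."""
    return Picture((((0, 0), (dx, dy)),), (0, 0), (dx, dy))

def _shift(p, dx, dy):
    def move(q):
        return (q[0] + dx, q[1] + dy)
    return Picture(tuple((move(a), move(b)) for a, b in p.segments),
                   move(p.tail), move(p.head))

def _join(a, b, point_a, point_b):
    # Translate b so that point_b coincides with point_a, then merge.
    moved = _shift(b, point_a[0] - point_b[0], point_a[1] - point_b[1])
    return Picture(a.segments + moved.segments, a.tail, moved.head)

def plus(a, b):   # a + b : head(a) CAT tail(b)
    return _join(a, b, a.head, b.tail)

def times(a, b):  # a x b : tail(a) CAT tail(b)
    return _join(a, b, a.tail, b.tail)

def minus(a, b):  # a - b : head(a) CAT head(b)
    return _join(a, b, a.head, b.head)

def neg(a):       # ~a : exchange tail and head
    return Picture(a.segments, a.head, a.tail)

v3 = primitive(0, 3)   # assumed: upward vertical stroke
h2 = primitive(2, 0)   # assumed: rightward horizontal stroke
letter_L = times(v3, h2)   # L -> (v3 x h2)
```

Under these assumptions, `letter_L` consists of a vertical and a horizontal stroke that share their tails at the origin, which is the corner of the letter.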
For example, the character "A" can be described as follows:

[Figure: hierarchical PDL description of the character "A", built from a triangle and the
primitives a, b and c.]

3.2. Tree Grammars [6,11,20]

By extension of one-dimensional concatenation to multidimensional concatenation,
strings are generalized to trees. Naturally, if a pattern can be conveniently described by
a tree, it can be easily generated by a tree grammar.

Let $N^+$ be the set of strictly positive integers. Let $U$ be the universal tree domain (the
free semigroup with identity element "0" generated by $N^+$ and a binary operation "$\cdot$").
The depth of $a \in U$ is denoted $d(a)$ and defined as: $d(0) = 0$, $d(a \cdot i) = d(a) + 1$,
$i \in N^+$. $a \le b$ if and only if there exists $x \in U$ such that $a \cdot x = b$. $a$ and $b$
are incomparable if and only if $a \not\le b$ and $b \not\le a$. Figure 3.1 represents the
universal tree domain $U$. $D$ is a tree domain if and only if $D$ is a finite subset of $U$
satisfying (1) $b \in D$ and $a < b$ implies $a \in D$, and (2) $a \cdot j \in D$ and $i < j$ in
$N^+$ implies $a \cdot i \in D$.

                 0
              /  |  \
             1   2   3  ...
           /   \
        1.1     1.2  ...
       /    \
   1.1.1    1.1.2  ...

Figure 3.1 Universal tree domain

A ranked alphabet is a pair $(\Sigma, r)$ where $\Sigma$ is a finite set of symbols and

$$r : \Sigma \to N = N^+ \cup \{0\}.$$

For $a \in \Sigma$, $r(a)$ is called the rank of $a$. Let

$$\Sigma_n = r^{-1}(n).$$

A tree over $\Sigma$ (i.e., over $(\Sigma, r)$) is a function

$$\alpha : D \to \Sigma$$

such that $D$ is a tree domain and

$$r[\alpha(a)] = \max\{i \mid a \cdot i \in D\},$$

that is, the rank of a label at "a" must be equal to the number of branches in the tree
domain at a. The domain of a tree $\alpha$ is denoted by $D(\alpha)$ or $D_\alpha$. Let $T_\Sigma$ be the
set of all trees over $\Sigma$. The depth of $\alpha$ is defined as
$d(\alpha) = \max\{d(a) \mid a \in D(\alpha)\}$. Let $\alpha$ be a tree and "a" be a member of
$D(\alpha)$. $\alpha/a$, a subtree of $\alpha$ at "a", is defined as

$$\alpha/a = \{(b,x) \mid (a \cdot b, x) \in \alpha\}.$$
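
The definitions above translate directly into a short Python sketch. It assumes that an address in the universal tree domain is a tuple of positive integers (the empty tuple standing for the identity element "0") and that a tree is a dictionary from addresses to labels. The function names and the example tree are hypothetical.

```python
def is_tree_domain(domain):
    """Check conditions (1) and (2): closed under prefixes and left siblings."""
    nodes = set(domain)
    for a in nodes:
        if a and a[:-1] not in nodes:              # (1) the parent must be present
            return False
        if a and a[-1] > 1 and a[:-1] + (a[-1] - 1,) not in nodes:
            return False                           # (2) the left sibling must be present
    return True

def branches_at(domain, a):
    """max{i | a.i in D}, taken as 0 when a has no successor."""
    return max((b[-1] for b in domain
                if len(b) == len(a) + 1 and b[:-1] == a), default=0)

def is_tree(alpha, rank):
    """alpha maps addresses to labels; rank maps each label to its rank."""
    return is_tree_domain(alpha) and all(
        rank[label] == branches_at(alpha, a) for a, label in alpha.items())

def subtree(alpha, a):
    """alpha/a = {(b, x) | (a.b, x) in alpha}."""
    n = len(a)
    return {b[n:]: x for b, x in alpha.items() if b[:n] == a}

# Hypothetical example tree over the ranked alphabet {$: 2, a: 0, c: 1, d: 0}.
RANK = {"$": 2, "a": 0, "c": 1, "d": 0}
ALPHA = {(): "$", (1,): "a", (2,): "c", (2, 1): "d"}
```

Representing addresses as tuples makes the prefix order on the tree domain a simple slice comparison.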
For example, please refer to Fig. 3.2, which gives a tree grammar

$$G_A = (V_A, r_A, P_A, A)$$

where

$$V_A = \{A, A_1, A_2, N_a, N_d, N_c\}, \quad V_{TA} = \{\$, a, c, d\},$$
$$r_A(\$) = \{2\}, \quad r_A(a) = \{0, 2\}, \quad r_A(c) = \{0, 1\}, \quad r_A(d) = \{0\},$$

and the productions $P_A$ rewrite $A$, $A_1$ and $A_2$ into trees, together with
$N_a \to a$, $N_d \to d$, $N_c \to c$.

[Figure 3.2 A tree grammar for character "A"]

3.3. Array Grammar (AG) [4,20,22,33]
Definition 3.1

An isometric array grammar is a quintuple

$$G = (V_N, V_T, P, S, \#)$$

where $V_N$ is a finite nonempty set of nonterminal symbols, $V_T$ is a finite nonempty set of
terminal symbols, $V_N \cap V_T = \emptyset$ and $\# \notin (V_N \cup V_T)$ is the background or blank
symbol. $P$ is a finite nonempty set of rewriting rules of the form $\alpha \to \beta$, where array
$\alpha$ and array $\beta$ are geometrically identical over $V_N \cup V_T \cup \{\#\}$, with $\alpha$ not all
#'s. $S \in V_N$ is the starting symbol.

We say that array x directly generates array y, denoted $x \Rightarrow y$, if there is a rewriting
rule $\alpha \to \beta$, x contains $\alpha$ as a subarray and y is identical to x except that the
subarray $\alpha$ is replaced with the corresponding symbols of the array $\beta$. Let $\Rightarrow^*$ be
the reflexive transitive closure of $\Rightarrow$. The language generated by an array grammar G,
denoted L(G), is the set of all arrays of terminal symbols and #'s that can be generated
from the starting symbol S in a field of #'s.
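
The direct derivation step can be sketched in Python as follows. The sketch assumes that arrays are rectangular and stored as lists of equal-length strings, with "#" as the blank symbol; the function name and the example rule are hypothetical, and non-rectangular rule shapes are not handled.

```python
def direct_derivations(x, alpha, beta):
    """Return every array y with x => y under the single rule alpha -> beta."""
    # Isometric rules: both sides must have exactly the same shape.
    assert len(alpha) == len(beta)
    assert all(len(r) == len(s) for r, s in zip(alpha, beta))
    h, w = len(alpha), len(alpha[0])
    results = []
    for i in range(len(x) - h + 1):
        for j in range(len(x[0]) - w + 1):
            # Does alpha occur as a subarray with its top-left corner at (i, j)?
            if all(x[i + k][j:j + w] == alpha[k] for k in range(h)):
                y = list(x)
                for k in range(h):
                    y[i + k] = y[i + k][:j] + beta[k] + y[i + k][j + w:]
                results.append(y)
    return results

# Hypothetical rule S# -> aS applied to the start symbol in a field of #'s.
FIELD = ["S##"]
STEP_1 = direct_derivations(FIELD, ["S#"], ["aS"])
STEP_2 = direct_derivations(STEP_1[0], ["S#"], ["aS"])
```

Because the two sides of a rule have the same shape, the array never changes size during a derivation, which is the isometric property.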

Definition  3.2 

The distance between two arrays X and Y over $V_T$, denoted d(X, Y), can be defined
as the smallest number of error transformations required to derive Y from X.

Example 3.1: Given two arrays X and Y over {a, b}, where X is a T-shaped array consisting
of a horizontal row of five a's with a vertical stem of a's below it, we have

X  |-(Ti)  X1  |-(Td)  X2  |-(Ts)  Y

where the symbol |- means "becomes": the insertion Ti adds a b above the horizontal row,
the deletion Td removes one a from the horizontal row, and the substitution Ts replaces
the a at the bottom of the stem by b.

The minimum number of error transformations required to transform X to Y is 3.
Therefore, d(X,Y) = 3.
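
A minimal Python sketch of this count is given below. It assumes that both arrays lie on the same grid with "#" as the blank symbol and that no shifting of the pattern is allowed, so that the distance reduces to the number of differing cells. The cell layout of X and Y is an assumed arrangement chosen to mirror Example 3.1, and the function name is hypothetical.

```python
BLANK = "#"

def error_transformations(x, y):
    """Count the insertions, deletions and substitutions that turn x into y."""
    counts = {"insertion": 0, "deletion": 0, "substitution": 0}
    for row_x, row_y in zip(x, y):
        for cx, cy in zip(row_x, row_y):
            if cx == cy:
                continue
            if cx == BLANK:
                counts["insertion"] += 1      # blank cell becomes a symbol
            elif cy == BLANK:
                counts["deletion"] += 1       # symbol becomes blank
            else:
                counts["substitution"] += 1   # one symbol replaced by another
    return counts

# Assumed layout of the arrays in Example 3.1.
X = ["#####", "aaaaa", "##a##", "##a##", "##a##", "##a##"]
Y = ["##b##", "a#aaa", "##a##", "##a##", "##a##", "##b##"]
COUNTS = error_transformations(X, Y)
DISTANCE = sum(COUNTS.values())
```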


[Figure 3.3 Three arrays (a), (b) and (c) over '*']

Definition 3.2 is not very satisfactory, as can be seen from the following example.

Example 3.2 Given the three arrays in Fig. 3.3, it can be seen that d[(a), (b)] = 18 and
d[(a), (c)] = 6. Therefore one would classify (a) and (c), instead of (a) and (b), into one
cluster. Anyone who knows English will easily recognize that (a) is more similar to (b)
than to (c). In fact, a survey of 18 persons showed that all of them thought that
(b) is a T while (c) is probably an F or an E.

The above example shows that the direct distance measure between two arrays is not
satisfactory for clustering analysis. This motivates the following method.
Assume that every array $\Re$ is a variation or a noisy counterpart of a pure, noiseless
array $\Im$, generated by an array grammar Gc, called the 'core grammar'. If additional rules
are added to Gc, resulting in Ga, called the 'augmented grammar', then Ga can also
generate $\Re$. For instance, in Fig. 3.3, why does a person recognize (b), instead of (c), as a
T? Because the person must have learned this to be a T before, and one can assume that
a typical, pure or noiseless T was used as a model when he or she learned it. A description
of this model for T could be: "T has a horizontal stroke with a vertical stroke attached
below the center of the horizontal one, and both strokes are of similar length". In other
words, when a T is learned, it is learned as if there were an array grammar built into
one's brain. This grammar can be considered as a core grammar characterizing T. With
a certain kind of flexibility, that is, with some extra rules attached to this core grammar
resulting in a new augmented grammar, one can also recognize some variations or noisy
T's.

Definition  3.3 

The distance between an array $\Re$ and a group of arrays characterized by an augmented
array grammar $G_a$ is $d_a(G_a, \Re)$ = minimum number of symbols of $\Re$ not covered by the
parsing of $G_a$.

Definition 3.4

The distance between an array $\Re$ and a group of arrays characterized by a core array
grammar $G_c$ is $d_c(G_c, \Re)$ = minimum number of parsing steps of $\Re$ using non-core rules
of $G_a$ + $d_a(G_a, \Re)$.

Notice that if there does not exist a parsing sequence for $\Re$, then both its $d_a$ and $d_c$
values are either infinite or undefined. In Fig. 3.3, for instance,

$$d_a[G_a,(a)] = 0,\ d_c[G_c,(a)] = 0,\ d_a[G_a,(b)] = 0,\ d_c[G_c,(b)] = 9,\ d_a[G_a,(c)] = -,\ d_c[G_c,(c)] = -.$$

We  now  propose  an  algorithm  that  will  produce  an  augmented  grammar  from  a  core 
grammar. 

Without loss of generality, one can assume that each core grammar is in Chomsky
2-point normal form [4,9]. Notice that because a different definition of connectedness is
being used, we have a slightly different Chomsky normal form, in that every rule either
has the form A -> a, or rewrites a nonterminal A together with one of its eight
neighbouring blank cells # into the symbols B and C, with B taking the place of A and C
taking the place of the #, for example

               #         C
A# -> BC,    A     ->  B  ,   and so on for the remaining six positions of #,

where $A \in V_N$, $B, C \in V_N \cup V_T$ and $a \in V_T$.

In general, a rule can be represented by a triple (A, BC, d), where 0 ≤ d ≤ 7. For
instance, A# -> BC can be represented by (A, BC, 0),

  #         C
A     ->  B

can be represented by (A, BC, 1), etc., and A -> a is represented by (A, a, -). Notice
that this representation coincides with Freeman's octal chain code primitives, shown in
Fig. 4.2 of the next section.

Algorithm A1 (Core-to-augmented)
Input. k core grammars

$$G_i = (V_{N_i}, V_{T_i}, P_i, S_i, \#), \quad i = 1, \ldots, k$$

Output. k augmented grammars

Step 1. For i = 1 to k do step 2
Step 2.

$$P_i^a = P_i \cup \{(A, BC, (d+1) \bmod 8),\ (A, BC, (d+7) \bmod 8) \mid (A, BC, d) \in P_i\}$$

For convenience, if in $P_i$ the jth rule is (A, BC, d), then add j.(d+1) mod 8 and
j.(d+7) mod 8 into $P_i^a$.
Step 3.
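
Step 2 of the algorithm can be sketched in Python as below, assuming that each rule is stored as a triple (A, BC, d) and that a terminal rule A -> a carries None in place of a direction. The function name and the example rule set are hypothetical.

```python
def core_to_augmented(core_rules):
    """Add, for every rule (A, BC, d), the two rules with neighbouring directions."""
    augmented = list(core_rules)
    for lhs, rhs, d in core_rules:
        if d is None:          # terminal rule A -> a has no direction to vary
            continue
        for neighbour in ((d + 1) % 8, (d + 7) % 8):
            rule = (lhs, rhs, neighbour)
            if rule not in augmented:
                augmented.append(rule)
    return augmented

# Hypothetical core rules: grow to the right (direction 0), then terminate.
CORE = [("S", "aS", 0), ("S", "a", None)]
AUGMENTED = core_to_augmented(CORE)
```

Each directional rule thus tolerates a deviation of one chain-code step to either side, which is what lets the augmented grammar accept slightly tilted strokes.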

We are now ready to propose a two-pass clustering procedure.

Cluster Pass I

Input. A set of n digitized patterns $X = \{x_1, x_2, \ldots, x_n\}$, and a set of k core grammars
$Z = \{G_1, G_2, \ldots, G_k\}$, where $L(G_i) \cap L(G_j) = \emptyset$ for $i \ne j$, $i, j = 1, \ldots, k$.

Output. The assignment of $x_i$, $i = 1, \ldots, n$, to k clusters characterized by $G_i$,
$i = 1, \ldots, k$.

Step 1. Call CORE-TO-AUGMENTED
Step 2. For i = 1 to n do step 3-5
Step 3. For j = 1 to k do step 4-5
Step 4. Compute $d_a(G_j, x_i)$
Step 5. Assign $x_i$ to cluster p with

$$d_a(G_p, x_i) = \min_{j=1,\ldots,k} d_a(G_j, x_i).$$

Notice that if more than one cluster has the same minimum distance to $x_i$, then
$x_i$ should be assigned to all these clusters.

If $x_i$ has been assigned to more than one cluster, then use Cluster Pass II.

Cluster Pass II

For all $x_{i_1}, x_{i_2}, \ldots, x_{i_m}$ which are assigned to more than one cluster, do the following.
Input. $x_{i_j}$ and its assigned grammars $G_{t_1}, \ldots, G_{t_h}$, where $1 \le j \le m \le n$,
$1 \le h \le k$, $1 \le i_j \le n$, and $1 \le t_l \le k$.

Output. Assignment of $x_{i_j}$ to a proper cluster.

Step 1. Compute $d_c(G_{t_l}, x_{i_j})$ for all $G_{t_l}$ to which $x_{i_j}$ has been assigned.
Step 2. Assign $x_{i_j}$ to cluster $t_q$ where $d_c(G_{t_q}, x_{i_j}) = \min_l d_c(G_{t_l}, x_{i_j})$ over all
$G_{t_l}$ to which $x_{i_j}$ has been assigned.
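
The control flow of the two passes can be sketched in Python as follows. The sketch assumes that the distances $d_a$ and $d_c$ are supplied as functions (array parsing itself is not implemented here) and that a pattern with no parsing sequence has infinite distance. The function name, the grammar labels and the toy distance tables are hypothetical.

```python
import math

def two_pass_cluster(patterns, grammars, d_a, d_c):
    """Assign each pattern to one cluster; ties in Pass I are resolved by Pass II."""
    assignment = {}
    for x in patterns:
        # Pass I: distance to every augmented grammar.
        dist = {g: d_a(g, x) for g in grammars}
        best = min(dist.values())
        candidates = [g for g in grammars if dist[g] == best]
        if math.isinf(best):
            assignment[x] = None               # no grammar can parse x
        elif len(candidates) == 1:
            assignment[x] = candidates[0]
        else:
            # Pass II: break the tie with the core-grammar distance.
            assignment[x] = min(candidates, key=lambda g: d_c(g, x))
    return assignment

# Toy distance tables (hypothetical values), indexed by (grammar, pattern).
DA = {("G_T", "b"): 0, ("G_F", "b"): 0,
      ("G_T", "c"): 2, ("G_F", "c"): 1,
      ("G_T", "z"): math.inf, ("G_F", "z"): math.inf}
DC = {("G_T", "b"): 1, ("G_F", "b"): 9}

RESULT = two_pass_cluster(["b", "c", "z"], ["G_T", "G_F"],
                          lambda g, x: DA[(g, x)],
                          lambda g, x: DC[(g, x)])
```

Only the patterns that tie in Pass I incur the cost of computing the core-grammar distance.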

The  set  of  English  characters  adapted  from  Fu  and  Lu[7]  is  illustrated  using  the  above 
two-pass  clustering  procedure.  There  are  51  characters  from  nine  different  classes:  P,  X, 
D,  U,  F,  Y,  K,  H,  and  V  characterized  by  GO,  Gl,  G2,  G3,  G4,  G5,  G6,  G7,  and  G8, 
respectively.  Eight  of  the  nine  classes  are  selected  from  four  groups,  each  with  similar 
structures;  that  is  D  and  P,  H  and  K,  U  and  V  and  X  and  Y.  The  class  of  character  F 
is  different  from  the  other  eight  classes.  Each  character  is  an  array  on  a  20*20  grid,  as 
shown  in  Fig.  3.4.  Generated  in  a  top  down  fashion,  each  pattern  is  parsed  upwards  from 
the  bottom.  The  results  of  9  clusters  are  satisfactory  as  shown  in  [29]. 


[Figure 3.4. 51 digitized English characters.]

In Fu and Lu [7], the use of a similarity measure using tree grammar for syntactic
patterns in terms of grammar transformations is proposed, and a nearest neighbor
rule and a clustering procedure are described. This clustering procedure is quite
successful for two-dimensional pattern analysis, especially for English characters. However, in
this procedure, every character must first be encoded into a one-dimensional string via
Freeman's chain code [5] and Shaw's PDL [21]. Further, pattern 47 (Figure 3.4) is not
accurately classified into group 2, D, even when weighted transformations are used with a
threshold value as high as 3. In Lu [11], a tree-to-tree distance measurement algorithm is
proposed and is applied to clustering analysis for 2-dimensional patterns. Again, in this
algorithm, every pattern must first be encoded into a tree representation and, further,
there are some confusions between samples of A and E and between samples of E and C.

Please note that although the purposes of [29] and [7,11] are somewhat similar, in the
sense that they all aim at the same target (i.e. they all try to solve the clustering problem
for two-dimensional patterns), [29] employs an entirely different approach. For instance,
the model used in [29] is the "isometric array grammar", the first time such a model has
been tried for two-dimensional pattern clustering analysis. Also, the definition of distance
between two patterns introduced in Section 3 of this paper does not use the conventional
error transformation method. Instead, we use the concept of parsing and define the
distance between an array and a class of arrays characterized by a Context-Free Array
Grammar (CFAG) [4]. Even though the same input data adapted from [7,11] is used to
test our two-pass clustering algorithm, we obtain a different result. In addition to the
differences discussed above, we also notice that pattern 7 of Fig. 3.4 is classified as an
X by our algorithm, in contrast to an F by the algorithm of Fu and Lu [7,11].
A survey of 18 persons showed that 7 of them thought of it as an X, 8 of them thought
it was an F, while 3 of them thought it was neither. Clearly it depends on how one is
trained to write, and the array grammar depends on that training.

4. Structural Methods: BM Approach and Extended Freeman Approach (EF)

a.   Berthod  and  Maroy's  primitives. 

We  denote  Berthod  and  Maroy's  primitives  [3]  and  encoding  scheme  as  BM.  Every 
character  is  encoded  into  a  string  of  the  5  primitives  shown  in  Table  4.1. 
T: straight line
P: positive curve (counterclockwise)
M: minus curve (clockwise)
L: penlift
R: cusp

Table 4.1 The five Berthod and Maroy primitives


Examples of codes corresponding to various characters are shown in Figure 4.1.
Let the domain D be all English characters. A dictionary consisting of about 26
characters (strings of primitives) resulting from pre-processing of characters drawn on a
graphic tablet is compiled.


Code        Symbols      Code        Symbols

M           O            TM          S
MLT         J            TLTTT       M
MM          M            TRM         DP
P           CLU          TRLMLT      AF
PLT         EGQ          TRMLTLT     E
PLTLT       E            TRMM        B
PM          S            TRMRM       BR
PRT         G            TRBT        R
PT          G            TRTTLT      AFK
TLM         DPJ          TRTTLTLT    E
TLMM        BR           TRTTT       N
TLMRM       B            TRTTT       M
TLMT        R            TT          LV
TLT         XYT          TTLT        J
TLTLT       AFHIKNYZ     TTLTLT      E
TLTLTLT     E            TTT         ZNS
TLTT        N            TTTT        WM
TLTTM       B

Table 4.2 A dictionary of 26 English characters using BM


Now, if the domain D is reduced to capital English characters only, the dictionary will
contain 13 ambiguous words (Table 4.2). The most ambiguous word is TLTLT, which
has 8 different interpretations. Therefore DegAmb(B,D,BM) = 8 for B ∈ {A,F,H,I,K,N,Y,Z}.
The code P has 3 interpretations (C,L,U), TLT has 3 interpretations (X,Y,T), and TTT has 3
interpretations (Z,N,S), etc.

It is seen that characters A, F, H, I, K, N, Y and Z, which are not even syntactically
similar, all result in the same encoded word "TLTLT". That makes this word extremely
ambiguous and non-informative. A tremendous amount of effort must be spent to
disambiguate it. Besides, obviously unacceptable symbols could be recognized as an "E",
since they too are encoded as "TTLTLT", which is in the dictionary and is unambiguous.

[Figure 4.1 Examples of codes corresponding to characters (TTTT, TRMLT, TRTT, PRT).]

[Figure 4.2 Freeman's octal chain code primitives]

[Figure 4.3. (a) Coded word of "C" is 3456701 and (b) coded word of "H" is
$6\,\bar{1}\,6\,\bar{3}\,0$.]

b. Extended Primitives

We adapt Freeman's chain code [5] and the 8 direction vectors shown in Figure 4.2
as primitives. Each character is encoded into a string of primitives, but only the local
extrema points (points which are tangent to one of the 8 vectors) are extracted. The
letter "C" could be encoded as 3456701 as shown in Figure 4.3(a). To handle penlift, we
consider the vector between 'pen-up' and its successive 'pen-down' and choose the closest
octal primitive with a bar on it. Therefore the letter "H" could be encoded as
$6\,\bar{1}\,6\,\bar{3}\,0$ as shown in Figure 4.3(b).

Let us call the above encoding scheme EF (for Extended Freeman) and let
D = English characters. Using the same sample data, a dictionary consisting of about 48
words is compiled in the ordering 0 > 1 > ... > 6 > 7, as shown in Table 4.3. In this
dictionary, only 7 words are ambiguous and DegAmb(B, D, EF) = 2 for B ∈ {C,O,D,P,X,Y,T}.
Further, this method eliminates the possibility of misrecognizing a number of obviously
unacceptable symbols. For example, such symbols
will not be interpreted as the valid character "E".
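
The EF encoding of straight strokes and penlifts can be sketched in Python as follows. The sketch assumes the usual Freeman numbering (0 points east and the codes increase counterclockwise, with the y axis pointing up), writes a barred primitive as the digit followed by an apostrophe, and emits one code per straight stroke instead of extracting local extrema from a curve. The function names and the stroke coordinates for "H" are hypothetical.

```python
import math

def freeman_direction(dx, dy):
    """Closest of the 8 octal directions to the vector (dx, dy)."""
    angle = math.degrees(math.atan2(dy, dx))
    return round(angle / 45.0) % 8

def encode_strokes(strokes):
    """Encode straight strokes; each penlift becomes a barred primitive (digit + ')."""
    code = []
    previous_end = None
    for start, end in strokes:
        if previous_end is not None:
            # Vector from 'pen-up' to the following 'pen-down'.
            lift = freeman_direction(start[0] - previous_end[0],
                                     start[1] - previous_end[1])
            code.append(f"{lift}'")
        code.append(str(freeman_direction(end[0] - start[0], end[1] - start[1])))
        previous_end = end
    return code

# Assumed strokes of "H": left vertical, right vertical, then the crossbar.
H_STROKES = [((0, 2), (0, 0)), ((1, 2), (1, 0)), ((0, 1), (1, 1))]
H_CODE = encode_strokes(H_STROKES)
```

With these assumed coordinates the two penlifts fall closest to directions 1 and 3, which agrees with the coded word given for "H" in Figure 4.3(b).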


Code            Symbols    Code          Symbols

0462050         E          076526        P
046             T          04620         F
0462050         E          0464          J
04640           I          161           N
21              V          2121          W
245670123406    G          2716          M
3456701         CO         4560          C
4620            F          462050        E
4675            S          560           C
5670123         CO         567012367     Q
575             S          51̄630         A
51̄715           N          51̄730         A
527             XY         61            L
602             U          6157          K
620657          R          62065765      B
6275            DP         6420          J
670             L          67012         U
6157            K          61̄630         H
620657          R          62065765      B
620765          DP         6271          N
62716           M          6275          DP
627575          B          630           T
7171            W          72            V
71̄5             Y          725           X

Table 4.3. An EF Dictionary


5. Improved Extended Freeman Approach (IEF)


In the extended Freeman approach, there are 16 primitives, namely 0, 1, 2, ..., 7 and the
barred primitives $\bar{0}, \bar{1}, \ldots, \bar{7}$. However, we find that this large number of
primitives is not required. In this section, the number of primitives is reduced by half by
avoiding the use of $\bar{0}, \bar{1}, \ldots, \bar{7}$. The efficiencies remain the same. In fact, some
advantages are gained, as shown in Fig. 5.1.

[Figure 5.1 (a) EF (04620) versus (b) IEF (04620)]
Code            Symbols    Code          Symbols

0462050         E          076526        P
046             T          04620         F
0462050         E          0464          J
04640           I          161           N
21              V          2121          W
245670123406    G          2761          M
3456701         CO         4560          C
4620            F          0462050       E
4675            S          560           C
5670123         CO         567012367     Q
575             S          51630         A
51715           N          51730         A
527             XY         61            L
602             U          6157          K
620657          R          62065765      B
6275            DP         6420          J
670             L          67012         U
6157            K          61630         H
620657          R          62065765      B
620765          DP         6271          N
62715           M          6275          DP
627575          B          630           T
7171            W          72            V
715             Y          725           X

Table 5.1. An IEF dictionary

6. Off-Line Character Recognition


In this section, off-line character recognition that simulates the real-time on-line
technique is discussed, and an algorithm for transforming two-dimensional line-drawing
patterns to parsing sequences based on the "universal array grammar" is constructed.

In conventional syntactic pattern recognition, each class of patterns is represented by
a grammar [6]. In order to classify or recognize an input pattern given different classes of
patterns characterized by different grammars, the input pattern is normally transformed
to a pattern representation through preprocessing. Then it is compared with all given
grammars and is classified into the class with the minimum distance measure, provided
such a distance is within a certain threshold. Such an approach is basically structural
and hierarchical. It divides a rather complicated pattern into subpatterns, which in turn
are further divided into subpatterns and so on, until primitives are found which can no
longer be divided. It is structurally sound and can overcome some of the difficulties of
the decision-theoretic (classic) approach discussed earlier.

However, when the number of classes in a pattern recognition problem is very high, pattern matching and classification involve too many grammars and become too time-consuming to be practical. Besides, grammatical inference has been shown to be NP-hard. Mainly because of this, the concept of a "universal grammar" was proposed in [32] for on-line line-drawing patterns, with classification determined by the parsing sequence rather than by grammars.

Here, this concept of a "universal grammar" is explored for off-line line-drawing patterns. We use the model of the "array grammar" because it offers several advantages over other methods in dealing with two-dimensional patterns, as described in Section 3.



Figure 6.1. Patterns (a) character 'C' and (b) digit '2'

Example 6.1. G1 = (Vn, Vt, P1, S, #), where
Vn = {S, S1, S2, S3, S4, S5, S6, S7, S8, S9, S10} and

P1:

 0.   #  S    -->   S1 *

 1.      S1   -->      *
      #             S2

 2.      S2   -->      *
      #             S3

 3.   S3      -->   *
      #             S4

 4.   S4      -->   *
      #             S5

 5.   S5      -->   *
      #             S6

 6.   S6      -->   *
         #             S7

 7.   S7 #    -->   *  S8

 8.   S8 #    -->   *  S9

 9.   S9 #    -->   *  S10

10.   S10     -->   *

Example 6.2. G2 = (Vn, Vt, P2, S, #), where
Vn = {S, S1, S2, S3, S4, S5, S6, S7, S8, S9, S10, S11} and

P2:

 0.   #       -->   S1
      S             *

 1.   S1 #    -->   *  S2

 2.   S2      -->   *
         #             S3

 3.   S3      -->   *
      #             S4

 4.      S4   -->      *
      #             S5

 5.      S5   -->      *
      #             S6

 6.      S6   -->      *
      #             S7

 7.   S7      -->   *
      #             S8

 8.   S8 #    -->   *  S9

 9.   S9 #    -->   *  S10

10.   S10 #   -->   *  S11

11.   S11     -->   *

It can be seen from Examples 6.1 and 6.2 that the arrays generated by G1 and G2 are the character 'C' and the digit '2', respectively. Generally speaking, one should construct different array grammars to represent different patterns. It is extremely difficult, if not impossible, to construct adequate grammars for all patterns. Therefore, in [32] a universal line array grammar (ULAG) was proposed to generate all on-line patterns. For recognition, it utilizes the parsing sequence, not the grammar itself, to distinguish between different classes of patterns.

ULAG Gu = (Vn, Vt, P, S, #), where Vn = {S}, Vt = {a} and

P:

 0.   S  #    -->   a  S

 1.      #    -->      S
      S             a

 2.   #       -->   S
      S             a

 3.   #       -->   S
         S             a

 4.   #  S    -->   S  a

 5.      S    -->      a
      #             S

 6.   S       -->   a
      #             S

 7.   S       -->   a
         #             S

 8.   S       -->   a

In this universal line array grammar, a code is associated with each production rule except the terminal rule, and it is produced as part of the parsing sequence if the corresponding rule is successfully applied.

Using ULAG, the parsing sequence of 'C' is 4, 5, 5, 6, 6, 6, 7, 0, 0, 0, 8, and the parsing sequence of '2' is 2, 0, 7, 6, 5, 5, 5, 6, 0, 0, 0, 8.
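
To make the role of the parsing sequence concrete, the following Python sketch replays a ULAG parsing sequence on a grid. It assumes, from a reading of the rules of Gu, that codes 0 to 7 move the nonterminal S to a neighbouring cell in the Freeman chain-code directions (0 = right, 2 = up, 4 = left, 6 = down, odd codes the diagonals in between) and that code 8 is the terminal rule. The names MOVES, replay and render are illustrative and do not come from the paper.

```python
# Assumed move for each ULAG code: (dx, dy) with x to the right and y downward.
MOVES = {
    0: (1, 0), 1: (1, -1), 2: (0, -1), 3: (-1, -1),
    4: (-1, 0), 5: (-1, 1), 6: (0, 1), 7: (1, 1),
}
TERMINAL = 8  # rule 8 (S --> a) ends the derivation


def replay(sequence):
    """Return the cells turned into 'a' when the coded rules are applied in order."""
    x, y = 0, 0
    cells = []
    for code in sequence:
        cells.append((x, y))      # the current cell becomes a terminal 'a'
        if code == TERMINAL:
            break
        dx, dy = MOVES[code]
        x, y = x + dx, y + dy     # S moves on to the neighbouring blank cell
    return cells


def render(cells):
    """Draw the cells as rows of '*' characters, one string per row."""
    min_x = min(x for x, _ in cells)
    max_x = max(x for x, _ in cells)
    min_y = min(y for _, y in cells)
    max_y = max(y for _, y in cells)
    rows = []
    for y in range(min_y, max_y + 1):
        row = "".join("*" if (x, y) in cells else " "
                      for x in range(min_x, max_x + 1))
        rows.append(row.rstrip())
    return rows


letter_c = replay([4, 5, 5, 6, 6, 6, 7, 0, 0, 0, 8])
digit_2 = replay([2, 0, 7, 6, 5, 5, 5, 6, 0, 0, 0, 8])
print("\n".join(render(letter_c)))
print()
print("\n".join(render(digit_2)))
```

Under these assumptions the two sequences draw a 'C'-like and a '2'-like stroke, so the sequence alone is enough to tell the two classes apart.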

The above concept can be extended to off-line character recognition.

UAG G = (Vn, Vt, P, S, #), where Vn = {S, S1}, Vt = {a} and

P:

 0.   S  #    -->   S1 S

 1.      #    -->      S
      S             S1

 2.   #       -->   S
      S             S1

 3.   #       -->   S
         S             S1

 4.   #  S    -->   S  S1

 5.      S    -->      S1
      #             S

 6.   S       -->   S1
      #             S

 7.   S       -->   S1
         #             S

 8.   S1      -->   S

 9.   S       -->   a

Example   6.3. 

We can use the UAG above to get the parsing sequence for the English letter E:
a  a  a

a 
a  a 


as  follows. 


4        8        9       4  8  9  8 

#  S  ==>  S  SI  ==>  S  S  ==>  S  a  ==>  S  SI  a  ====>  S  a  a  ==>  SI  a  a 

S 

89  6         890  9  8 

====>  a  a  a  ==>  a  a  a   ====>  a  a  a  ====>  a  a  a  ====>  a  a  a 
S          SI           a  a  a 

S  SIS  Sla         S  a 

6  86898689  80898089 

====>  a  a  a  ========>  a  a  a  =========>  a  a  a 

a  a  a 

a  a  a  a  a  a 

SI  a  a 

SI  a  a  a 

In the above example, we used the UAG parsing sequence to represent a pattern, but such sequences are rather long because nonterminal-to-nonterminal rules and terminal rules are involved. We now propose an algorithm, PATSEQ, which produces a unique, shorter parsing sequence from a pattern. To describe this algorithm, some definitions are required.

Definition 6.1. The neighbors of a pixel, p0, are identified by the eight directions p1, p2, ..., p8 shown in Figure 6.2.

p5  p6  p7
p4  p0  p8
p3  p2  p1

Figure 6.2. Pixel p0 and its neighbors

Definition 6.2. A segment of a pixel, p0, is a set of neighbors of p0 which are consecutive.

In the example below, there are two segments, S1 and S2, for the pixel p0, where S1 = (p5, p6, p7) and S2 = (p2, p3).

p5  p6  p7
    p0
p3  p2

Later, we use Si to indicate segment i. The length of a segment, length(Si), is equal to the number of elements in the segment Si. For example, length(S1) = 3 and length(S2) = 2.

Definition 6.3. A segment is perfect if it satisfies the following conditions:
i) length(Si) < 4, and
ii) pi and pj are not in one segment, where i = 2, 4, 6, 8 and j = (i+2) mod 8.

Definition 6.4. The center pixel of a perfect segment, Ci, is as follows:

i) If length(Si) = 1, then Ci is the only element in Si.

ii) If 2 ≤ length(Si) ≤ 3, then Ci = pk, where pk is the element of Si with k = 2, 4, 6 or 8. We also call Ci the next pixel of p0.

Definition 6.5. The parsing code from the current pixel, p0, to one of its next pixels is the number of the rule successfully applied.

Algorithm PATSEQ: transform a pattern into a parsing sequence.
Input:  a) Initial position (X0, Y0).
        b) Digitized pattern in which all segments are perfect.
Output: A parsing sequence.
Method:

Step 1: push(X0); push(Y0); Q:=empty; (Q stores the passed pixels).

Step 2: repeat

Step 3:    if stack is empty then step 7.
           Y:=pop; X:=pop;
           if (X,Y) is in Q then step 3;

Step 4:    push all next pixels of (X,Y) and their parsing codes
           into stack; if no next pixel for p0 then push(-1);

Step 5:    Delete current pixel from pattern.

Step 6:    z:=pop; if z=-1 then step 6 else print(z).

Step 7: until stack is empty.

The following example illustrates the use of the algorithm PATSEQ.

     1  2  3  4  5  6  7   --> x
 1
 2            a
 3            b  c
 4            d
 5      i  h  e  f  g
 6
 |
 v
 y

We use the characters a, b, ... to indicate the positions (4,2), (4,3), (5,3), ... of the pattern pixels.

The parsing process of character 'r' is as follows.

Step  Pixel  Stack       Pixels reached       Output

 1           a
 2    a      b 6         a b                  6
 3    b      d 6 c 0     a b c d              0
 4    c      d 6 -1      a b c d              6
 5    d      e 6         a b c d e            6
 6    e      h 4 f 0     a b c d e f h        0
 7    f      h 4 g 0     a b c d e f g h      0
 8    g      h 4 -1      a b c d e f g h      4
 9    h      i 4         a b c d e f g h i    4
10    i      -1          a b c d e f g h i
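
The walk traced in this example can be sketched in Python as follows. This is a simplified reading of PATSEQ, not the authors' implementation, and it rests on several assumptions: the neighbour p_k is given the parsing code 8 - k (so that right = 0, left = 4, down = 6, as in the trace); next pixels are computed on the full pattern and pixels already passed are skipped, instead of deleting pixels; the stack holds (pixel, code) pairs instead of separate entries and the -1 marker; and the positions of d to i are inferred from the codes in the trace, since only (4,2), (4,3) and (5,3) are stated. All function names are illustrative.

```python
# Offsets (dx, dy) of the neighbours p1..p8 of Figure 6.2, x to the right, y downward.
OFFSETS = {1: (1, 1), 2: (0, 1), 3: (-1, 1), 4: (-1, 0),
           5: (-1, -1), 6: (0, -1), 7: (1, -1), 8: (1, 0)}


def segments(pattern, p0):
    """Maximal runs of consecutive occupied neighbours of p0 (p8 and p1 are adjacent)."""
    x, y = p0
    present = [k for k in range(1, 9)
               if (x + OFFSETS[k][0], y + OFFSETS[k][1]) in pattern]
    if len(present) == 8:
        return [present]
    runs = []
    for k in present:
        previous = 8 if k == 1 else k - 1
        if previous in present:
            continue                      # k lies inside a run that starts earlier
        run = [k]
        following = k % 8 + 1
        while following in present:
            run.append(following)
            following = following % 8 + 1
        runs.append(run)
    return runs


def center(run):
    """Index k of the center pixel of a perfect segment (Definition 6.4)."""
    if len(run) == 1:
        return run[0]
    even = [k for k in run if k % 2 == 0]
    if len(run) > 3 or len(even) != 1:
        raise ValueError("segment is not perfect")
    return even[0]


def patseq(pattern, start):
    """Depth-first walk over next pixels; returns the list of parsing codes."""
    passed = set()                        # plays the role of Q
    stack = [(start, None)]               # the start pixel has no parsing code
    output = []
    while stack:
        pixel, code = stack.pop()
        if pixel in passed:
            continue
        passed.add(pixel)
        if code is not None:
            output.append(code)
        for run in segments(pattern, pixel):
            k = center(run)
            nxt = (pixel[0] + OFFSETS[k][0], pixel[1] + OFFSETS[k][1])
            if nxt not in passed:
                stack.append((nxt, 8 - k))  # assumed code of neighbour p_k
    return output


# Pattern 'r' of the example; the positions of d..i are inferred from the trace.
R_PATTERN = {(4, 2), (4, 3), (5, 3), (4, 4), (4, 5), (5, 5), (6, 5), (3, 5), (2, 5)}
print(patseq(R_PATTERN, (4, 2)))
```

With these assumptions the sketch emits the codes in the same order as the Output column of the trace. Because next pixels are pushed in the order p1, ..., p8 and popped last-in first-out, the branch with the smaller code is followed first.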

Please  note  that  in  this  method,  each  parsing  sequence  code  represents  a  line  segment 
vector  that  can  cover  -22.5  to  +22.5  degrees. 
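
This tolerance corresponds to quantizing a stroke direction to the nearest multiple of 45 degrees. A minimal sketch, assuming that code 0 points along the positive x axis, that codes increase counterclockwise and that y points upward; the function name direction_code is illustrative:

```python
import math


def direction_code(dx, dy):
    """Quantize the vector (dx, dy) to the nearest of eight 45-degree directions."""
    if dx == 0 and dy == 0:
        raise ValueError("a zero-length vector has no direction")
    angle = math.degrees(math.atan2(dy, dx))   # angle in (-180, 180]
    # Each code owns the 45-degree sector centred on its direction; a vector lying
    # exactly on a sector boundary (e.g. 22.5 degrees) may fall to either side.
    return round(angle / 45.0) % 8


print(direction_code(10, 3))    # about 16.7 degrees above the x axis
print(direction_code(10, -3))   # about 16.7 degrees below the x axis
print(direction_code(1, -10))   # nearly straight down
```

Strokes that deviate by less than 22.5 degrees from one of the eight directions therefore receive the same code, which is why slanted variants of a stroke collapse onto one parsing sequence.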

Example 6.4. The letter "L" written in the following ways is always transformed into the same parsing sequence "66600". Therefore, only one learning sample is necessary for this letter, and it takes only one address in the dictionary. This address can be quickly retrieved during the pattern matching process of recognition.
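
Using the parsing sequence as a dictionary address can be sketched with a hash table. The entry for "L" is the one given in Example 6.4; the helper names learn and recognize, and the None result for an unknown sequence, are illustrative assumptions.

```python
def learn(dictionary, parsing_codes, label):
    """Store one learning sample: the parsing sequence becomes the address."""
    dictionary["".join(str(code) for code in parsing_codes)] = label


def recognize(dictionary, parsing_codes):
    """Look up a parsing sequence; returns None when it is not in the dictionary."""
    return dictionary.get("".join(str(code) for code in parsing_codes))


dictionary = {}
learn(dictionary, [6, 6, 6, 0, 0], "L")      # sequence "66600" from Example 6.4

print(recognize(dictionary, [6, 6, 6, 0, 0]))
print(recognize(dictionary, [6, 6, 0]))
```

Every written variant that yields the same sequence reaches the same entry, so a single stored sample covers the whole class.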


This technique significantly lowers the number of learning samples and the size of the dictionary. It also saves pattern matching time because each parsing sequence can actually function as a hashing code that serves as an address, which can represent a rather large class of patterns in the dictionary.


7.  Discussions and Conclusions

We introduced the basic concepts of degrees of recognizability and ambiguity and discussed their relationship. Many other factors affect recognizability, such as human factors, social environment, writing instruments, and software and hardware environment, as well as algorithm-oriented characteristics such as segmentation, data reduction, resolution, primitive selection, thresholding and quantization. Nevertheless, the degree of ambiguity plays an inherent role as an important criterion of recognizability for handwritten symbols.

Two different categories of experiments representing different recognition schemes for on-line handwritten symbols were conducted. The first used basic structural shapes as primitives, while the second used octal chain codes as primitives. The second method provided the advantages of flexible size, orientation and variations, and the need for fewer learning samples. It also possesses an inherently lower degree of ambiguity. Besides, this method is less likely to misrecognize an obviously unacceptable symbol as a valid one. A summary is given in Table 7.1.


      |  Grammar       |  Structure                    |  Accuracy
      |                |                               |  (degrees of ambiguity
      |                |                               |  and recognizability)

TG    |  Simpler       |  Less Flexible                |  Low
AG    |  More Complex  |  More Flexible                |  High
PDL   |  Context Free  |  Flexible                     |  Medium
BM    |                |  Flexible (Fewer Primitives)  |  More ambiguous
EF    |                |  Flexible (More Primitives)   |  Less ambiguous
IEF   |                |  Flexible (Few Primitives)    |  Less ambiguous

where TG: Tree Grammar, AG: Array Grammar, PDL: Picture Descriptive Language,
BM: Berthod and Maroy, EF: Extended Freeman, IEF: Improved EF

Table 7.1. A summary of comparisons of different methods